Introduction

In the US, attending college is a true financial investment. Many students take out tens of thousands of dollars in student loans to attend school and earn their degree. According to studentloanhero.com, 69% of the graduating class of 2018 left school with student loans, with an average debt of $29,800. Altogether, American citizens “owe over $1.56 trillion in student loan debt, spread out among about 45 million borrowers” (studentloanhero.com).

The student debt crisis centers on money: paying for college, and being able to pay off student debt after graduation.

To better understand how college costs and alumni salaries relate to other factors, we looked for a dataset that provides financial information about students and costs as well as extensive information about the schools themselves. Such a dataset gives greater insight into the possible driving factors behind published statistics like median starting salary and school tuition prices.

The Dataset

Using the publicly available and widely trusted U.S. News 2019 Best National University ranking list, we were able to access information about top schools that was organized in a digestible and scrapable manner.

We obtained information regarding 12 different variables (using the URL and School Name to uniquely identify the universities):

  • Tuition Cost (in USD - out of state tuition used if university is a state school)
  • Room & Board (in USD)
  • Total Enrollment
  • School Type (Private/Public)
  • Year Founded
  • Setting
  • Endowment 2017
  • Median Starting Salary Of Alumni (3 years postgraduate)
  • Selectivity
  • Fall 2017 Acceptance Rate
  • Male Percentage
  • Four Year Graduation Rate

We chose to use the top 110 schools and removed one row for data consistency, because it did not provide a Median Starting Salary for Alumni. This was the University of California–Davis.

Research Questions

  • What variables are statistically significant in predicting college tuition prices?
    • Which factors seem to drive the biggest increases?
    • Does a higher endowment mean that students have to spend less on tuition? (i.e. does it appear that schools use their endowments to supplement tuition prices at all?)
  • What variables are statistically significant in predicting median starting salary for alumni (with 3 years of experience post-grad)?
    • Is the cost of undergraduate tuition a key factor? (i.e. is spending more on education correlated with earning more later?)
    • Is median starting salary correlated with the male to female ratio at the institution?

Required Tools

You will use R and the following libraries:

  • ggplot2
  • rvest
  • tidyverse
  • stringr

Part 1: Data Scraping

The main URL we use contains only limited information about each school, such as its ranking and tuition. Detailed information about each school lives on that school's own page, so the first step is to retrieve the proper URL for each university from the US News ranking page. We then parse the important information from each page into tables that can be used for data analysis.

We are scraping the data of the top 110 schools from https://www.usnews.com/best-colleges/rankings/national-universities. Because the ranking page loads its list in increments, we saved the fully loaded HTML to a local text file (html_top100.txt). We parse that file to find the URL for each college’s informational page.

Note: The information for the University of California–Davis was removed from the dataset because it didn’t contain median alumni salary, which plays a large role in our analysis.

Note: Room and Board, Tuition and Fees, and Median Alumni Salary are all in thousands of dollars.

library(rvest)
library(tidyverse)
url <-"html_top100.txt"
college_urls <- url %>%
  read_html() %>%
  html_node("body") %>% html_nodes("ol[class~=bEyEue]") %>% html_nodes("li[id]")%>% html_nodes("h3") %>% 
  html_nodes("a[href]") %>%
  html_attr("href") 
head(college_urls)
## [1] "/best-colleges/princeton-university-2627"                 
## [2] "/best-colleges/harvard-university-2155"                   
## [3] "/best-colleges/columbia-university-2707"                  
## [4] "/best-colleges/massachusetts-institute-of-technology-2178"
## [5] "/best-colleges/university-of-chicago-1774"                
## [6] "/best-colleges/yale-university-1426"
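
To make the selector chain above easier to follow, here is a minimal sketch that runs the same kind of selectors against a tiny, made-up HTML snippet instead of the saved ranking page. The markup, the `school-1` style ids and the example paths are assumptions for illustration only; just the `bEyEue` class name and the site root are taken from the code above.

```r
library(rvest)

# Hypothetical, simplified stand-in for the saved ranking page
toy_html <- paste0(
  "<html><body><ol class='list bEyEue'>",
  "<li id='school-1'><h3><a href='/best-colleges/example-university-1'>Example University</a></h3></li>",
  "<li id='school-2'><h3><a href='/best-colleges/sample-college-2'>Sample College</a></h3></li>",
  "<li><h3><a href='/ads/sponsored'>Sponsored link</a></h3></li>",
  "</ol></body></html>"
)

toy_urls <- read_html(toy_html) %>%
  html_nodes("ol[class~=bEyEue]") %>%  # <ol> whose class list contains the word bEyEue
  html_nodes("li[id]") %>%             # only <li> elements that carry an id attribute
  html_nodes("h3 a[href]") %>%         # links inside each list item's heading
  html_attr("href")

# Prepend the site root to turn the relative paths into full URLs
full_urls <- paste0("https://www.usnews.com", toy_urls)
print(full_urls)
```

The third list item has no `id` attribute, so `li[id]` skips it. This is how a selector chain like the one above can leave out list entries that are not schools.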

A data frame is created to store the information of each college in rows. Columns are initialized with their corresponding default values.

index_num <- 0

college_tab_1 <-  data.frame("URL" = gsub(" ", "", paste("https://www.usnews.com",college_urls, sep = "")), 
"CollegeName"= "", "TuitionFeesThousands" = 0, "RoomBoardThousands" = 0, "TotalEnrollment" = 0, "SchoolType" = "", "YearFounded" = 0, "Setting" = "", "Endowment2017Millions" = 0, "MedianStartingSalaryOfAlumniThousands" = 0, "Selectivity" = "", "Fall2017AcceptanceRate" = 0, "MalePercentage" = 0, "FourYearGraduationRate" = 0, stringsAsFactors = FALSE) 

#removing one college that doesn't have a median starting salary, for data uniformity
college_tab_1 <- college_tab_1[-c(40),]

Below are functions used to obtain data from the website and parse it.

#retrieves a node set of size three containing the Tuition&Fees, Room&Board, and total enrollment
get_info <- function(url_html){
  #the pipeline is the last expression, so its result is returned visibly
  url_html %>% html_node("body") %>% html_nodes("div[id~=content-main]") %>%   
    html_nodes("section[class~=hero-stats-widget-stats]") %>%
    html_nodes("ul") %>% html_nodes("li") %>% html_nodes("strong")
}

#takes in a node set and index, and parses that entry to a double in thousands of dollars
#ex: $47,263 -> 47.263
get_tuition_rm <- function(info, num){
  a_1 <- info[num] %>%  html_text()
  tuition_rm <- 
    as.double(paste(substring(a_1, 2, str_locate(a_1, ",")[1] - 1), substring(a_1, str_locate(a_1, ",")[1] + 1, str_locate(a_1, " ")[1] - 1), sep=""))
  tuition_rm / 1000.0
}

#takes in a vector and parses the total enrollment information to a double
get_enrollment <- function(info){
  a_1 <- info[3] %>%  html_text()
  as.double(paste(substring(a_1, 1, str_locate(a_1, ",")[1] - 1), substring(a_1, str_locate(a_1, ",")[1] + 1), sep=""))
}

#gets the share of the majority gender at a certain university as a proportion (e.g. 0.52)
get_percent <- function(url_html){
  attr <- url_html %>% html_node("body") %>% html_nodes("div[id~=content-main]") %>%   
    html_nodes("div[class~=block-normal]") %>% html_nodes("span[class~=distribution-breakdown__percentage]") %>% html_text()
  as.double(substring(attr, 1, str_locate(attr, "%")[1] - 1)) / 100.0
}

#retrieves the gender of the majority sex and parses the percentage to be in terms of males
get_gender_ratio <- function(url_html){
  attr <- url_html %>% html_node("body") %>% html_nodes("div[id~=content-main]") %>%   
    html_nodes("div[class~=block-normal]") %>% html_nodes("span[class~=distribution-breakdown__percentage-copy]") %>% html_text()
  attr <- sub("\n                    ","",attr)
  attr <- sub("\n                ","",attr)
  if (attr == "Female"){
    1 - get_percent(url_html)
  }else{
    get_percent(url_html)
  }
}
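
The helpers above rely on character positions to cut numbers out of strings. As an alternative, the sketch below uses regular expressions in base R. The function names `parse_dollars_thousands` and `parse_percent` are hypothetical and not part of the original code. The sketch assumes the scraped strings look like the raw values shown later in this section (for example "$47,140 (2018-19)" or "6%", possibly with leading whitespace).

```r
# Pull the leading dollar amount out of a string and return it in thousands of dollars.
# Strings that do not start with a dollar amount become NA.
parse_dollars_thousands <- function(x) {
  has_amount <- grepl("^[[:space:]]*\\$[0-9]", x)
  amount <- sub("^[[:space:]]*\\$([0-9,]+(\\.[0-9]+)?).*$", "\\1", x)
  amount[!has_amount] <- NA_character_
  as.numeric(gsub(",", "", amount)) / 1000
}

# Convert a percentage string such as "6%" into a proportion such as 0.06
parse_percent <- function(x) {
  as.numeric(sub("%", "", trimws(x), fixed = TRUE)) / 100
}

print(parse_dollars_thousands(c("\n            $47,140 (2018-19)", "$68,400*", "N/A")))
print(parse_percent(c("\n            6%", "89%")))
```

Both functions are vectorized, so they can be applied to a whole column at once rather than one value at a time.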

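Here, we use the functions above together with rvest's html_node and html_nodes functions to fill out the table. The loop downloads each school's page once and pulls every field from that page.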

college_tab <- college_tab_1

for (i in 1:nrow(college_tab)){
  url_html <- college_tab[i,1] %>%read_html()
  college_tab[i,]$CollegeName <- url_html %>% html_node("body") %>% html_nodes("h1[class~=hero-heading]") %>% html_text()
  priv_tuition <- url_html %>% html_node("body") %>% html_nodes("span[data-test-id~=v_private_tuition]") %>% html_text()
  college_tab[i,]$TuitionFeesThousands <- ifelse(length(priv_tuition) > 0, priv_tuition, 
                                                 url_html %>% html_node("body") %>% html_node("span[data-test-id~=v_out_state_tuition]") %>% html_text())
  college_tab[i,]$RoomBoardThousands <- url_html %>% html_node("body") %>% html_node("span[data-test-id~=w_room_board]") %>% html_text()
  college_tab[i,]$TotalEnrollment <- url_html %>% html_node("body") %>% html_node("span[data-test-id~=total_all_students]") %>% html_text()
  college_tab[i,]$MalePercentage <- get_gender_ratio(url_html)
  college_tab[i,]$Fall2017AcceptanceRate <- url_html %>% html_node("span[data-test-id~=r_c_accept_rate]") %>% html_text()
  college_tab[i,]$Selectivity <- url_html %>% html_node("span[data-test-id~=c_select_class]") %>% html_text()
  college_tab[i,]$FourYearGraduationRate <- url_html %>% html_node("span[data-test-id~=grad_rate_4_year]") %>% html_text()
  college_tab[i,]$MedianStartingSalaryOfAlumniThousands <-  url_html %>% html_nodes("div[data-field-id=averageStartSalary]") %>%html_node("span[data-test-id]") %>% html_text()
  temp_vector <- url_html %>% html_node("body") %>% html_nodes("div[id~=content-main]") %>%html_nodes("div[class~=flex-row]") %>%   html_nodes("span[class~=heading-small]") %>% html_text()
  college_tab[i,]$SchoolType <- temp_vector[1]
  college_tab[i,]$YearFounded <- temp_vector[2]
  college_tab[i,]$Setting <- temp_vector[5]
  college_tab[i,]$Endowment2017Millions  <- temp_vector[6]
}

head(college_tab)
##                                                                               URL
## 1                  https://www.usnews.com/best-colleges/princeton-university-2627
## 2                    https://www.usnews.com/best-colleges/harvard-university-2155
## 3                   https://www.usnews.com/best-colleges/columbia-university-2707
## 4 https://www.usnews.com/best-colleges/massachusetts-institute-of-technology-2178
## 5                 https://www.usnews.com/best-colleges/university-of-chicago-1774
## 6                       https://www.usnews.com/best-colleges/yale-university-1426
##                                             CollegeName
## 1                  \n        Princeton University\n    
## 2                    \n        Harvard University\n    
## 3                   \n        Columbia University\n    
## 4 \n        Massachusetts Institute of Technology\n    
## 5                 \n        University of Chicago\n    
## 6                       \n        Yale University\n    
##              TuitionFeesThousands              RoomBoardThousands
## 1 \n            $47,140 (2018-19) \n            $15,610 (2018-19)
## 2 \n            $50,420 (2018-19) \n            $17,160 (2018-19)
## 3 \n            $59,430 (2018-19) \n            $14,016 (2018-19)
## 4 \n            $51,832 (2018-19) \n            $15,510 (2018-19)
## 5 \n            $57,006 (2018-19) \n            $16,350 (2018-19)
## 6 \n            $53,430 (2018-19) \n            $16,000 (2018-19)
##        TotalEnrollment    SchoolType YearFounded  Setting
## 1  \n            8,273 Private, Coed        1746 Suburban
## 2 \n            20,604 Private, Coed        1636    Urban
## 3 \n            25,968 Private, Coed        1754    Urban
## 4 \n            11,466 Private, Coed        1861    Urban
## 5 \n            13,736 Private, Coed        1890    Urban
## 6 \n            12,974 Private, Coed        1701     City
##   Endowment2017Millions MedianStartingSalaryOfAlumniThousands
## 1         $23.4 billion                \n            $68,400*
## 2         $37.1 billion                \n            $66,500*
## 3         $10.0 billion                \n            $64,900*
## 4       $14.8 billion +                \n            $79,800*
## 5        $6.6 billion +                \n            $57,700*
## 6       $27.2 billion +                \n            $63,200*
##                    Selectivity Fall2017AcceptanceRate MalePercentage
## 1 \n            Most selective       \n            6%           0.51
## 2 \n            Most selective       \n            5%           0.52
## 3 \n            Most selective       \n            6%           0.52
## 4 \n            Most selective       \n            7%           0.54
## 5 \n            Most selective       \n            9%           0.51
## 6 \n            Most selective       \n            7%           0.50
##   FourYearGraduationRate
## 1      \n            89%
## 2      \n            84%
## 3      \n            88%
## 4      \n            85%
## 5      \n            88%
## 6      \n            87%
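
The loop above issues one web request per school, so a single failed download would stop the whole run. The following is a minimal sketch of a retry wrapper in base R. The name `fetch_with_retry` and its parameters are hypothetical, and the demonstration uses a stand-in function that simulates one network failure instead of contacting the real site.

```r
# Call fetch(url) up to max_tries times, pausing between failed attempts.
# Returns the fetched result, or NULL if every attempt failed.
fetch_with_retry <- function(url, fetch, max_tries = 3, pause_seconds = 1) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fetch(url), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", attempt, " failed for ", url, ": ", conditionMessage(result))
    Sys.sleep(pause_seconds)
  }
  NULL  # the caller decides how to handle a page that never loaded
}

# Offline demonstration: a stand-in fetcher that fails once, then succeeds
calls <- 0
flaky_fetch <- function(url) {
  calls <<- calls + 1
  if (calls < 2) stop("simulated network error")
  paste("page for", url)
}

page <- fetch_with_retry("https://example.com/school", flaky_fetch, pause_seconds = 0)
print(page)
```

In the scraping loop, one could pass `read_html` as the `fetch` argument, for instance `fetch_with_retry(college_tab[i, 1], read_html)`, and skip the row when the result is NULL.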

Below, we reformat many of the columns to get usable data. Each column is categorized into the appropriate type of data.

formatted_college_tab <- college_tab
#fix type of School Type, Setting, Year Founded
formatted_college_tab$SchoolType <- as.factor(formatted_college_tab$SchoolType)
formatted_college_tab$Setting <- as.factor(formatted_college_tab$Setting)
formatted_college_tab$YearFounded <- as.integer(formatted_college_tab$YearFounded)
#fix Endowment2017 formatting
formatted_college_tab$Endowment2017Millions  <- ifelse(grepl("billion", formatted_college_tab$Endowment2017Millions ), sub("\\.","",formatted_college_tab$Endowment2017Millions ),formatted_college_tab$Endowment2017Millions )
formatted_college_tab$Endowment2017Millions  <-sub(" billion","00",formatted_college_tab$Endowment2017Millions )
formatted_college_tab$Endowment2017Millions  <-sub(" million","",formatted_college_tab$Endowment2017Millions )
formatted_college_tab$Endowment2017Millions  <-sub("[[:punct:]]", "",formatted_college_tab$Endowment2017Millions )
formatted_college_tab$Endowment2017Millions  <-sub("\\$", "",formatted_college_tab$Endowment2017Millions )
formatted_college_tab$Endowment2017Millions  <-sub(" \\+", "",formatted_college_tab$Endowment2017Millions )
formatted_college_tab$Endowment2017Millions <- as.double(formatted_college_tab$Endowment2017Millions)
#fix College Name formatting
formatted_college_tab$CollegeName <- sub("^\n        ","",formatted_college_tab$CollegeName)
formatted_college_tab$CollegeName <-sub("\n    ","",formatted_college_tab$CollegeName)
#fixing Acceptance Rate formatting
formatted_college_tab$Fall2017AcceptanceRate <- sub("\n            ","",formatted_college_tab$Fall2017AcceptanceRate)
formatted_college_tab$Fall2017AcceptanceRate <- sub("%","",formatted_college_tab$Fall2017AcceptanceRate)
formatted_college_tab$Fall2017AcceptanceRate <- as.double(formatted_college_tab$Fall2017AcceptanceRate)
formatted_college_tab$Fall2017AcceptanceRate <- formatted_college_tab$Fall2017AcceptanceRate/100
#fixing Grad Rate formatting
formatted_college_tab$FourYearGraduationRate <- sub("\n            ","",formatted_college_tab$FourYearGraduationRate)
formatted_college_tab$FourYearGraduationRate <- sub("%","",formatted_college_tab$FourYearGraduationRate)
formatted_college_tab$FourYearGraduationRate <- as.double(formatted_college_tab$FourYearGraduationRate)
formatted_college_tab$FourYearGraduationRate <- formatted_college_tab$FourYearGraduationRate/100
#fixing Salary formatting
formatted_college_tab$MedianStartingSalaryOfAlumniThousands <- 
  sub("\n            ","",formatted_college_tab$MedianStartingSalaryOfAlumniThousands)
formatted_college_tab$MedianStartingSalaryOfAlumniThousands <- gsub("\\*","",formatted_college_tab$MedianStartingSalaryOfAlumniThousands)
formatted_college_tab$MedianStartingSalaryOfAlumniThousands <- gsub("\\$","",formatted_college_tab$MedianStartingSalaryOfAlumniThousands)
formatted_college_tab$MedianStartingSalaryOfAlumniThousands <- gsub("\\,","",formatted_college_tab$MedianStartingSalaryOfAlumniThousands)
formatted_college_tab$MedianStartingSalaryOfAlumniThousands <- as.double(formatted_college_tab$MedianStartingSalaryOfAlumniThousands)/1000
#fixing Selectivity formatting
formatted_college_tab$Selectivity <- sub("\n            ","",formatted_college_tab$Selectivity)
formatted_college_tab$Selectivity <- as.factor(formatted_college_tab$Selectivity)
#fixing Tuition formatting
formatted_college_tab$TuitionFeesThousands <- sub("\n            ", "",formatted_college_tab$TuitionFeesThousands )
formatted_college_tab$TuitionFeesThousands <- sub(" \\(2018-19\\)", "",formatted_college_tab$TuitionFeesThousands )
formatted_college_tab$TuitionFeesThousands  <-sub("\\,", "",formatted_college_tab$TuitionFeesThousands )
formatted_college_tab$TuitionFeesThousands  <-sub("\\$", "",formatted_college_tab$TuitionFeesThousands )
formatted_college_tab$TuitionFeesThousands <- as.double(formatted_college_tab$TuitionFeesThousands)/1000
## Warning: NAs introduced by coercion
#fixing RoomBoard formatting
formatted_college_tab$RoomBoardThousands <- sub("\n            ", "",formatted_college_tab$RoomBoardThousands )
formatted_college_tab$RoomBoardThousands <- sub(" \\(2018-19\\)", "",formatted_college_tab$RoomBoardThousands )
formatted_college_tab$RoomBoardThousands  <-sub("\\,", "",formatted_college_tab$RoomBoardThousands )
formatted_college_tab$RoomBoardThousands  <-sub("\\$", "",formatted_college_tab$RoomBoardThousands )
formatted_college_tab$RoomBoardThousands <- as.double(formatted_college_tab$RoomBoardThousands)/1000
## Warning: NAs introduced by coercion
#fixing Enrollment formatting
formatted_college_tab$TotalEnrollment <- sub("\n            ", "",formatted_college_tab$TotalEnrollment )
formatted_college_tab$TotalEnrollment  <-sub("\\,", "",formatted_college_tab$TotalEnrollment )
formatted_college_tab$TotalEnrollment <- as.double(formatted_college_tab$TotalEnrollment)
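
The endowment cleanup above works by deleting the decimal point and appending "00" for values in billions, which assumes every such value has exactly one digit after the decimal point. A value without a decimal point, such as a hypothetical "$1 billion", would come out as 100 instead of 1000. The sketch below shows one possible alternative that extracts the number and multiplies by the unit. The function name `parse_endowment_millions` is hypothetical. The first two test strings are taken from the raw output above and the last two are made up.

```r
# Convert strings such as "$23.4 billion" or "$763.5 million" to millions of dollars.
# Assumes the number contains no thousands separators.
parse_endowment_millions <- function(x) {
  number <- sub("^[^0-9]*([0-9]+(\\.[0-9]+)?).*$", "\\1", x)
  number[!grepl("[0-9]", x)] <- NA_character_
  # 1 billion = 1000 million; strings with neither unit become NA
  multiplier <- ifelse(grepl("billion", x, fixed = TRUE), 1000,
                       ifelse(grepl("million", x, fixed = TRUE), 1, NA_real_))
  as.numeric(number) * multiplier
}

endowments <- parse_endowment_millions(
  c("$23.4 billion", "$14.8 billion +", "$763.5 million", "$1 billion")
)
print(endowments)
```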


formatted_college_tab <- formatted_college_tab %>% mutate(TotalCostThousands =TuitionFeesThousands + RoomBoardThousands )

formatted_college_tab <- na.omit(formatted_college_tab)
nrow(formatted_college_tab)
## [1] 107
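
Because na.omit drops rows without reporting which ones, it can help to inspect the incomplete rows first. The sketch below does this on a small made-up data frame. The school names and values are placeholders, and only the column names mirror the ones used in this section.

```r
# Made-up example data with missing values in two different columns
toy <- data.frame(
  CollegeName = c("School A", "School B", "School C"),
  TuitionFeesThousands = c(47.1, NA, 50.4),
  RoomBoardThousands = c(15.6, 14.0, NA),
  stringsAsFactors = FALSE
)

# Rows that na.omit() would remove
incomplete <- toy[!complete.cases(toy), ]
print(incomplete$CollegeName)

# Number of missing values per column
print(colSums(is.na(toy)))

cleaned <- na.omit(toy)
print(nrow(cleaned))
```

Running the same two checks on formatted_college_tab before the na.omit call would show which schools are dropped and which columns caused it.
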
as_tibble(formatted_college_tab)
## # A tibble: 107 x 15
##    URL   CollegeName TuitionFeesThou… RoomBoardThousa… TotalEnrollment
##    <chr> <chr>                  <dbl>            <dbl>           <dbl>
##  1 http… Princeton …             47.1             15.6            8273
##  2 http… Harvard Un…             50.4             17.2           20604
##  3 http… Columbia U…             59.4             14.0           25968
##  4 http… Massachuse…             51.8             15.5           11466
##  5 http… University…             57.0             16.4           13736
##  6 http… Yale Unive…             53.4             16             12974
##  7 http… Stanford U…             51.4             15.8           17178
##  8 http… Duke Unive…             56.0             15.9           16294
##  9 http… University…             55.6             15.6           21907
## 10 http… Johns Hopk…             53.7             15.8           25151
## # … with 97 more rows, and 10 more variables: SchoolType <fct>,
## #   YearFounded <int>, Setting <fct>, Endowment2017Millions <dbl>,
## #   MedianStartingSalaryOfAlumniThousands <dbl>, Selectivity <fct>,
## #   Fall2017AcceptanceRate <dbl>, MalePercentage <dbl>,
## #   FourYearGraduationRate <dbl>, TotalCostThousands <dbl>
#to save as csv to easily work on it without having to reload
write.csv(formatted_college_tab, file = "college_info.csv")
formatted_college_tab <- read.csv("college_info.csv")
formatted_college_tab <- formatted_college_tab[,-c(1)]
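
The code above writes the row names as an extra first column and then drops that column after reading the file back. A possible alternative, sketched below on a small made-up data frame, is to skip the row names when writing. The example also sets stringsAsFactors explicitly, since the default of read.csv differs between R versions (it was TRUE before R 4.0.0), and then converts only the categorical column back to a factor. The temporary file path and the toy values are assumptions for illustration.

```r
# Made-up example data
toy <- data.frame(
  CollegeName = c("School A", "School B"),
  SchoolType = c("Private, Coed", "Public, Coed"),
  TuitionFeesThousands = c(47.1, 30.2),
  stringsAsFactors = FALSE
)

csv_path <- tempfile(fileext = ".csv")

# row.names = FALSE avoids writing an extra index column
write.csv(toy, file = csv_path, row.names = FALSE)

# Read text columns back as character, then restore the categorical column
restored <- read.csv(csv_path, stringsAsFactors = FALSE)
restored$SchoolType <- as.factor(restored$SchoolType)
str(restored)
```
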
as_tibble(formatted_college_tab)
## # A tibble: 107 x 15
##    URL   CollegeName TuitionFeesThou… RoomBoardThousa… TotalEnrollment
##    <fct> <fct>                  <dbl>            <dbl>           <int>
##  1 http… Princeton …             47.1             15.6            8273
##  2 http… Harvard Un…             50.4             17.2           20604
##  3 http… Columbia U…             59.4             14.0           25968
##  4 http… Massachuse…             51.8             15.5           11466
##  5 http… University…             57.0             16.4           13736
##  6 http… Yale Unive…             53.4             16             12974
##  7 http… Stanford U…             51.4             15.8           17178
##  8 http… Duke Unive…             56.0             15.9           16294
##  9 http… University…             55.6             15.6           21907
## 10 http… Johns Hopk…             53.7             15.8           25151
## # … with 97 more rows, and 10 more variables: SchoolType <fct>,
## #   YearFounded <int>, Setting <fct>, Endowment2017Millions <dbl>,
## #   MedianStartingSalaryOfAlumniThousands <dbl>, Selectivity <fct>,
## #   Fall2017AcceptanceRate <dbl>, MalePercentage <dbl>,
## #   FourYearGraduationRate <dbl>, TotalCostThousands <dbl>
formatted_college_tab
##                                                                                   URL
## 1                      https://www.usnews.com/best-colleges/princeton-university-2627
## 2                        https://www.usnews.com/best-colleges/harvard-university-2155
## 3                       https://www.usnews.com/best-colleges/columbia-university-2707
## 4     https://www.usnews.com/best-colleges/massachusetts-institute-of-technology-2178
## 5                     https://www.usnews.com/best-colleges/university-of-chicago-1774
## 6                           https://www.usnews.com/best-colleges/yale-university-1426
## 7                       https://www.usnews.com/best-colleges/stanford-university-1305
## 8                           https://www.usnews.com/best-colleges/duke-university-2920
## 9                https://www.usnews.com/best-colleges/university-of-pennsylvania-3378
## 10                                      https://www.usnews.com/best-colleges/jhu-2077
## 11                  https://www.usnews.com/best-colleges/northwestern-university-1739
## 12       https://www.usnews.com/best-colleges/california-institute-of-technology-1131
## 13                        https://www.usnews.com/best-colleges/dartmouth-college-2573
## 14                         https://www.usnews.com/best-colleges/brown-university-3401
## 15                               https://www.usnews.com/best-colleges/vanderbilt-3535
## 16                       https://www.usnews.com/best-colleges/cornell-university-2711
## 17                                     https://www.usnews.com/best-colleges/rice-3604
## 18                 https://www.usnews.com/best-colleges/university-of-notre-dame-1840
## 19     https://www.usnews.com/best-colleges/university-of-california-los-angeles-1315
## 20        https://www.usnews.com/best-colleges/washington-university-in-st-louis-2520
## 21                         https://www.usnews.com/best-colleges/emory-university-1564
## 22                    https://www.usnews.com/best-colleges/georgetown-university-1445
## 23        https://www.usnews.com/best-colleges/university-of-california-berkeley-1312
## 24        https://www.usnews.com/best-colleges/university-of-southern-california-1328
## 25               https://www.usnews.com/best-colleges/carnegie-mellon-university-3242
## 26                                      https://www.usnews.com/best-colleges/uva-6968
## 27                         https://www.usnews.com/best-colleges/tufts-university-2219
## 28         https://www.usnews.com/best-colleges/university-of-michigan-ann-arbor-9092
## 29                              https://www.usnews.com/best-colleges/wake-forest-2978
## 30                                      https://www.usnews.com/best-colleges/nyu-2785
## 31   https://www.usnews.com/best-colleges/university-of-california-santa-barbara-1320
## 32                                      https://www.usnews.com/best-colleges/unc-2974
## 33                  https://www.usnews.com/best-colleges/university-of-rochester-2894
## 34                      https://www.usnews.com/best-colleges/brandeis-university-2133
## 35          https://www.usnews.com/best-colleges/georgia-institute-of-technology-1569
## 36                    https://www.usnews.com/best-colleges/university-of-florida-1535
## 37                           https://www.usnews.com/best-colleges/boston-college-2128
## 38                         https://www.usnews.com/best-colleges/william-and-mary-3705
## 39       https://www.usnews.com/best-colleges/university-of-california-san-diego-1317
## 40                        https://www.usnews.com/best-colleges/boston-university-2130
## 41          https://www.usnews.com/best-colleges/case-western-reserve-university-3024
## 42                  https://www.usnews.com/best-colleges/northeastern-university-2199
## 43                        https://www.usnews.com/best-colleges/tulane-university-2029
## 44                    https://www.usnews.com/best-colleges/pepperdine-university-1264
## 45                    https://www.usnews.com/best-colleges/university-of-georgia-1598
## 46   https://www.usnews.com/best-colleges/university-of-illinois-urbanachampaign-1775
## 47                                      https://www.usnews.com/best-colleges/rpi-2803
## 48                      https://www.usnews.com/best-colleges/university-of-texas-3658
## 49                  https://www.usnews.com/best-colleges/university-of-wisconsin-3895
## 50                                https://www.usnews.com/best-colleges/villanova-3388
## 51                        https://www.usnews.com/best-colleges/lehigh-university-3289
## 52                      https://www.usnews.com/best-colleges/syracuse-university-2882
## 53                      https://www.usnews.com/best-colleges/university-of-miami-1536
## 54                               https://www.usnews.com/best-colleges/ohio-state-6883
## 55         https://www.usnews.com/best-colleges/purdue-university-west-lafayette-1825
## 56                    https://www.usnews.com/best-colleges/rutgers-new-brunswick-6964
## 57                               https://www.usnews.com/best-colleges/penn-state-6965
## 58                                      https://www.usnews.com/best-colleges/smu-3613
## 59                 https://www.usnews.com/best-colleges/university-of-washington-3798
## 60                                      https://www.usnews.com/best-colleges/wpi-2233
## 61             https://www.usnews.com/best-colleges/george-washington-university-1444
## 62                                   https://www.usnews.com/best-colleges/uconn-29013
## 63                   https://www.usnews.com/best-colleges/university-of-maryland-2103
## 64                                      https://www.usnews.com/best-colleges/byu-3670
## 65           https://www.usnews.com/best-colleges/clark-university-massachusetts-2139
## 66                       https://www.usnews.com/best-colleges/clemson-university-3425
## 67     https://www.usnews.com/best-colleges/texas-am-university-college-station-10366
## 68                 https://www.usnews.com/best-colleges/florida-state-university-1489
## 69                       https://www.usnews.com/best-colleges/fordham-university-2722
## 70          https://www.usnews.com/best-colleges/stevens-institute-of-technology-2639
## 71      https://www.usnews.com/best-colleges/university-of-california-santa-cruz-1321
## 72                            https://www.usnews.com/best-colleges/umass-amherst-2221
## 73                 https://www.usnews.com/best-colleges/university-of-pittsburgh-3379
## 74      https://www.usnews.com/best-colleges/university-of-minnesota-twin-cities-3969
## 75                            https://www.usnews.com/best-colleges/virginia-tech-3754
## 76                      https://www.usnews.com/best-colleges/american-university-1434
## 77                        https://www.usnews.com/best-colleges/baylor-university-6967
## 78                          https://www.usnews.com/best-colleges/suny-binghamton-2836
## 79                 https://www.usnews.com/best-colleges/colorado-school-of-mines-1348
## 80             https://www.usnews.com/best-colleges/north-carolina-state-raleigh-2972
## 81                         https://www.usnews.com/best-colleges/stony-brook-suny-2838
## 82                                      https://www.usnews.com/best-colleges/tcu-3636
## 83                       https://www.usnews.com/best-colleges/yeshiva-university-2903
## 84                           https://www.usnews.com/best-colleges/michigan-state-2290
## 85       https://www.usnews.com/best-colleges/university-of-california-riverside-1316
## 86                 https://www.usnews.com/best-colleges/university-of-san-diego-10395
## 87                        https://www.usnews.com/best-colleges/howard-university-1448
## 88           https://www.usnews.com/best-colleges/indiana-university-bloomington-1809
## 89                https://www.usnews.com/best-colleges/loyola-university-chicago-1710
## 90                     https://www.usnews.com/best-colleges/marquette-university-3863
## 91                                       https://www.usnews.com/best-colleges/ub-9554
## 92                   https://www.usnews.com/best-colleges/university-of-delaware-1431
## 93                       https://www.usnews.com/best-colleges/university-of-iowa-1892
## 94         https://www.usnews.com/best-colleges/illinois-institute-of-technology-1691
## 95                         https://www.usnews.com/best-colleges/miami-university-7104
## 96           https://www.usnews.com/best-colleges/university-of-colorado-boulder-1370
## 97                     https://www.usnews.com/best-colleges/university-of-denver-1371
## 98              https://www.usnews.com/best-colleges/university-of-san-francisco-1325
## 99                    https://www.usnews.com/best-colleges/university-of-vermont-3696
## 100                     https://www.usnews.com/best-colleges/clarkson-university-2699
## 101                       https://www.usnews.com/best-colleges/drexel-university-3256
## 102                                     https://www.usnews.com/best-colleges/rit-2806
## 103                    https://www.usnews.com/best-colleges/university-of-oregon-3223
## 104                                    https://www.usnews.com/best-colleges/njit-2621
## 105                     https://www.usnews.com/best-colleges/st-louis-university-2506
## 106 https://www.usnews.com/best-colleges/suny-environmental-science-and-forestry-2851
## 107                       https://www.usnews.com/best-colleges/temple-university-3371
##                                            CollegeName
## 1                                 Princeton University
## 2                                   Harvard University
## 3                                  Columbia University
## 4                Massachusetts Institute of Technology
## 5                                University of Chicago
## 6                                      Yale University
## 7                                  Stanford University
## 8                                      Duke University
## 9                           University of Pennsylvania
## 10                            Johns Hopkins University
## 11                             Northwestern University
## 12                  California Institute of Technology
## 13                                   Dartmouth College
## 14                                    Brown University
## 15                               Vanderbilt University
## 16                                  Cornell University
## 17                                     Rice University
## 18                            University of Notre Dame
## 19               University of California--Los Angeles
## 20                  Washington University in St. Louis
## 21                                    Emory University
## 22                               Georgetown University
## 23                  University of California--Berkeley
## 24                   University of Southern California
## 25                          Carnegie Mellon University
## 26                              University of Virginia
## 27                                    Tufts University
## 28                   University of Michigan--Ann Arbor
## 29                              Wake Forest University
## 30                                 New York University
## 31             University of California--Santa Barbara
## 32           University of North Carolina--Chapel Hill
## 33                             University of Rochester
## 34                                 Brandeis University
## 35                     Georgia Institute of Technology
## 36                               University of Florida
## 37                                      Boston College
## 38                         College of William and Mary
## 39                 University of California--San Diego
## 40                                   Boston University
## 41                     Case Western Reserve University
## 42                             Northeastern University
## 43                                   Tulane University
## 44                               Pepperdine University
## 45                               University of Georgia
## 46            University of Illinois--Urbana-Champaign
## 47                    Rensselaer Polytechnic Institute
## 48                         University of Texas--Austin
## 49                    University of Wisconsin--Madison
## 50                                Villanova University
## 51                                   Lehigh University
## 52                                 Syracuse University
## 53                                 University of Miami
## 54                     Ohio State University--Columbus
## 55                   Purdue University--West Lafayette
## 56                   Rutgers University--New Brunswick
## 57      Pennsylvania State University--University Park
## 58                       Southern Methodist University
## 59                            University of Washington
## 60                     Worcester Polytechnic Institute
## 61                        George Washington University
## 62                           University of Connecticut
## 63                University of Maryland--College Park
## 64                     Brigham Young University--Provo
## 65                                    Clark University
## 66                                  Clemson University
## 67               Texas A&M University--College Station
## 68                            Florida State University
## 69                                  Fordham University
## 70                     Stevens Institute of Technology
## 71                University of California--Santa Cruz
## 72                University of Massachusetts--Amherst
## 73                            University of Pittsburgh
## 74                University of Minnesota--Twin Cities
## 75                                       Virginia Tech
## 76                                 American University
## 77                                   Baylor University
## 78                         Binghamton University--SUNY
## 79                            Colorado School of Mines
## 80            North Carolina State University--Raleigh
## 81                        Stony Brook University--SUNY
## 82                          Texas Christian University
## 83                                  Yeshiva University
## 84                           Michigan State University
## 85                 University of California--Riverside
## 86                             University of San Diego
## 87                                   Howard University
## 88                     Indiana University--Bloomington
## 89                           Loyola University Chicago
## 90                                Marquette University
## 91                         University at Buffalo--SUNY
## 92                              University of Delaware
## 93                                  University of Iowa
## 94                    Illinois Institute of Technology
## 95                            Miami University--Oxford
## 96                     University of Colorado--Boulder
## 97                                University of Denver
## 98                         University of San Francisco
## 99                               University of Vermont
## 100                                Clarkson University
## 101                                  Drexel University
## 102                  Rochester Institute of Technology
## 103                               University of Oregon
## 104                 New Jersey Institute of Technology
## 105                             Saint Louis University
## 106 SUNY College of Environmental Science and Forestry
## 107                                  Temple University
##     TuitionFeesThousands RoomBoardThousands TotalEnrollment    SchoolType
## 1                 47.140             15.610            8273 Private, Coed
## 2                 50.420             17.160           20604 Private, Coed
## 3                 59.430             14.016           25968 Private, Coed
## 4                 51.832             15.510           11466 Private, Coed
## 5                 57.006             16.350           13736 Private, Coed
## 6                 53.430             16.000           12974 Private, Coed
## 7                 51.354             15.763           17178 Private, Coed
## 8                 55.960             15.944           16294 Private, Coed
## 9                 55.584             15.616           21907 Private, Coed
## 10                53.740             15.836           25151 Private, Coed
## 11                54.567             16.626           21474 Private, Coed
## 12                52.362             15.525            2238 Private, Coed
## 13                55.035             15.756            6509 Private, Coed
## 14                55.656             14.670           10095 Private, Coed
## 15                49.816             16.234           12592 Private, Coed
## 16                55.188             14.816           23016 Private, Coed
## 17                47.350             14.000            7022 Private, Coed
## 18                53.391             15.410           12467 Private, Coed
## 19                41.294             15.991           45428  Public, Coed
## 20                53.399             16.440           15303 Private, Coed
## 21                51.306             14.456           14273 Private, Coed
## 22                54.104             16.418           19005 Private, Coed
## 23                43.232             17.764           41910  Public, Coed
## 24                56.225             15.400           36487 Private, Coed
## 25                55.465             14.418           14528 Private, Coed
## 26                48.891             11.590           24360  Public, Coed
## 27                56.382             14.560           11449 Private, Coed
## 28                49.350             11.534           46002  Public, Coed
## 29                53.322             16.032            8116 Private, Coed
## 30                51.828             18.156           51123 Private, Coed
## 31                42.486             15.673           25057  Public, Coed
## 32                35.169             11.190           29911  Public, Coed
## 33                53.926             15.938           11648 Private, Coed
## 34                55.395             15.440            5722 Private, Coed
## 35                33.020             14.596           29376  Public, Coed
## 36                28.658             10.120           52669  Public, Coed
## 37                55.464             14.478           13996 Private, Coed
## 38                44.701             12.236            8740  Public, Coed
## 39                42.074             13.733           35772  Public, Coed
## 40                53.948             15.720           33355 Private, Coed
## 41                49.042             15.190           11824 Private, Coed
## 42                51.387             16.880           21489 Private, Coed
## 43                54.820             15.190           11248 Private, Coed
## 44                53.932             15.320            7710 Private, Coed
## 45                30.404             10.038           37606  Public, Coed
## 46                32.568             11.308           48216  Public, Coed
## 47                53.880             15.260            7633 Private, Coed
## 48                37.480             10.804           51525  Public, Coed
## 49                36.805             11.114           43820  Public, Coed
## 50                53.458             14.020           10983 Private, Coed
## 51                52.930             13.600            7017 Private, Coed
## 52                51.853             15.550           22484 Private, Coed
## 53                50.226             14.108           17003 Private, Coed
## 54                30.742             12.434           59837  Public, Coed
## 55                28.804             10.030           41573  Public, Coed
## 56                31.282             12.706           49577  Public, Coed
## 57                34.858             11.570           47119  Public, Coed
## 58                54.493             16.845           11789 Private, Coed
## 59                36.898             12.798           46166  Public, Coed
## 60                50.530             14.774            6642 Private, Coed
## 61                55.230             13.850           27973 Private, Coed
## 62                38.098             12.874           27578  Public, Coed
## 63                35.216             12.429           40521  Public, Coed
## 64                 5.620              7.628           34334 Private, Coed
## 65                45.730              9.170            3153 Private, Coed
## 66                36.724             10.832           24387  Public, Coed
## 67                36.636             10.436           67580  Public, Coed
## 68                21.673             10.458           41362  Public, Coed
## 69                52.248             17.969           16037 Private, Coed
## 70                52.202             15.244            6771 Private, Coed
## 71                41.963             16.407           19457  Public, Coed
## 72                34.570             13.202           30340  Public, Coed
## 73                32.052             11.050           28642  Public, Coed
## 74                30.371             10.312           51848  Public, Coed
## 75                31.304              8.408           34440  Public, Coed
## 76                48.459             14.880           13858 Private, Coed
## 77                45.542              7.800           17059 Private, Coed
## 78                24.488             15.058           17342  Public, Coed
## 79                38.584             13.169            6117  Public, Coed
## 80                28.444             11.078           34432  Public, Coed
## 81                26.934             13.698           25989  Public, Coed
## 82                46.950             12.804           10489 Private, Coed
## 83                43.500             12.250            6311 Private, Coed
## 84                39.750             10.272           50019  Public, Coed
## 85                42.879             16.000           23278  Public, Coed
## 86                49.358             12.980            8905 Private, Coed
## 87                26.756             13.895            9392 Private, Coed
## 88                35.456             10.465           43710  Public, Coed
## 89                44.048             14.480           16673 Private, Coed
## 90                41.870             12.720           11426 Private, Coed
## 91                27.758             13.723           30648  Public, Coed
## 92                34.310             12.864           22970  Public, Coed
## 93                30.609             10.450           32166  Public, Coed
## 94                47.646             13.192            7164 Private, Coed
## 95                33.577             13.031           19700  Public, Coed
## 96                37.288             14.418           35230  Public, Coed
## 97                50.556             13.005           11434 Private, Coed
## 98                48.066             14.830           11080 Private, Coed
## 99                42.516             12.462           13340  Public, Coed
## 100               49.444             15.222            4233 Private, Coed
## 101               52.002             13.890           21940 Private, Coed
## 102               44.130             13.046           15346 Private, Coed
## 103               35.478             12.963           22887  Public, Coed
## 104               31.918             13.808           11446  Public, Coed
## 105               43.996             12.290           12098 Private, Coed
## 106               18.218             16.140            2215  Public, Coed
## 107               28.426             11.566           39948  Public, Coed
##     YearFounded  Setting Endowment2017Millions
## 1          1746 Suburban               23400.0
## 2          1636    Urban               37100.0
## 3          1754    Urban               10000.0
## 4          1861    Urban               14800.0
## 5          1890    Urban                6600.0
## 6          1701     City               27200.0
## 7          1885 Suburban               24800.0
## 8          1838 Suburban                7900.0
## 9          1740    Urban               12200.0
## 10         1876    Urban                3700.0
## 11         1851 Suburban                7900.0
## 12         1891 Suburban                2600.0
## 13         1769    Rural                5000.0
## 14         1764     City                3200.0
## 15         1873    Urban                4100.0
## 16         1865    Rural                6500.0
## 17         1912    Urban                5800.0
## 18         1842     City                9700.0
## 19         1919    Urban                4200.0
## 20         1853 Suburban                7200.0
## 21         1836     City                7600.0
## 22         1789    Urban                1700.0
## 23         1868     City                4400.0
## 24         1880    Urban                5100.0
## 25         1900    Urban                1700.0
## 26         1819 Suburban                6300.0
## 27         1852 Suburban                1700.0
## 28         1817     City               10800.0
## 29         1834 Suburban                1200.0
## 30         1831    Urban                4100.0
## 31         1909 Suburban                 332.4
## 32         1789 Suburban                2900.0
## 33         1850 Suburban                2100.0
## 34         1948 Suburban                 976.9
## 35         1885    Urban                2000.0
## 36         1853 Suburban                1600.0
## 37         1863 Suburban                2300.0
## 38         1693 Suburban                 874.1
## 39         1960    Urban                1400.0
## 40         1839    Urban                2000.0
## 41         1826    Urban                1800.0
## 42         1898    Urban                 795.9
## 43         1834    Urban                1300.0
## 44         1937 Suburban                 860.3
## 45         1785     City                1200.0
## 46         1867     City                1800.0
## 47         1824 Suburban                 674.3
## 48         1883    Urban                3700.0
## 49         1848     City                3800.0
## 50         1842 Suburban                 641.3
## 51         1865     City                1300.0
## 52         1870     City                1300.0
## 53         1925 Suburban                 948.6
## 54         1870    Urban                4200.0
## 55         1869     City                2300.0
## 56         1766     City                 985.5
## 57         1855     City                2000.0
## 58         1911    Urban                1500.0
## 59         1861    Urban                3200.0
## 60         1865     City                 502.5
## 61         1821    Urban                1700.0
## 62         1881    Rural                 401.3
## 63         1856 Suburban                 548.7
## 64         1875     City                1700.0
## 65         1887     City                 408.8
## 66         1889 Suburban                 682.7
## 67         1876     City               10800.0
## 68         1851     City                 639.4
## 69         1841    Urban                 691.1
## 70         1870     City                 183.9
## 71         1965 Suburban                 188.7
## 72         1863 Suburban                 323.6
## 73         1787    Urban                3900.0
## 74         1851    Urban                3300.0
## 75         1872    Rural                 987.6
## 76         1893 Suburban                 622.0
## 77         1845     City                1200.0
## 78         1946 Suburban                 109.3
## 79         1874 Suburban                 246.1
## 80         1887     City                1100.0
## 81         1957 Suburban                 234.0
## 82         1873 Suburban                1500.0
## 83         1886    Urban                 506.2
## 84         1855 Suburban                3100.0
## 85         1954     City                 231.1
## 86         1949    Urban                 503.6
## 87         1867    Urban                 646.6
## 88         1820     City                1100.0
## 89         1870     City                 593.5
## 90         1881    Urban                 626.2
## 91         1846 Suburban                 659.2
## 92         1743 Suburban                1400.0
## 93         1847     City                1400.0
## 94         1890    Urban                 241.9
## 95         1809    Rural                 512.4
## 96         1876     City                 596.4
## 97         1864     City                 711.3
## 98         1855    Urban                 349.6
## 99         1791 Suburban                 453.3
## 100        1896    Rural                 191.1
## 101        1891    Urban                 707.6
## 102        1829 Suburban                 847.2
## 103        1876     City                 828.5
## 104        1881    Urban                 112.4
## 105        1818    Urban                1100.0
## 106        1911     City                  35.9
## 107        1884    Urban                 615.4
##     MedianStartingSalaryOfAlumniThousands    Selectivity
## 1                                    68.4 Most selective
## 2                                    66.5 Most selective
## 3                                    64.9 Most selective
## 4                                    79.8 Most selective
## 5                                    57.7 Most selective
## 6                                    63.2 Most selective
## 7                                    70.7 Most selective
## 8                                    66.2 Most selective
## 9                                    66.1 Most selective
## 10                                   63.4 Most selective
## 11                                   58.8 Most selective
## 12                                   81.0 Most selective
## 13                                   63.8 Most selective
## 14                                   60.3 Most selective
## 15                                   61.4 Most selective
## 16                                   65.0 Most selective
## 17                                   64.9 Most selective
## 18                                   62.7 Most selective
## 19                                   56.6 Most selective
## 20                                   60.3 Most selective
## 21                                   57.9 Most selective
## 22                                   57.9 Most selective
## 23                                   64.3 Most selective
## 24                                   58.1 Most selective
## 25                                   71.6 Most selective
## 26                                   59.6 Most selective
## 27                                   59.3 Most selective
## 28                                   61.9 More selective
## 29                                   54.1 Most selective
## 30                                   57.4 Most selective
## 31                                   53.8 Most selective
## 32                                   49.6 Most selective
## 33                                   54.6 More selective
## 34                                   52.9 Most selective
## 35                                   68.1 Most selective
## 36                                   52.8 Most selective
## 37                                   57.9 Most selective
## 38                                   53.4 Most selective
## 39                                   58.0 Most selective
## 40                                   55.2 Most selective
## 41                                   63.0 Most selective
## 42                                   60.1 Most selective
## 43                                   50.2 Most selective
## 44                                   52.2 More selective
## 45                                   49.9 More selective
## 46                                   59.9 More selective
## 47                                   68.4 More selective
## 48                                   56.8 More selective
## 49                                   53.7 More selective
## 50                                   61.2 Most selective
## 51                                   65.8 Most selective
## 52                                   53.6 More selective
## 53                                   53.4 More selective
## 54                                   53.1 More selective
## 55                                   60.0 More selective
## 56                                   55.6 More selective
## 57                                   56.7 More selective
## 58                                   54.9 More selective
## 59                                   56.9 More selective
## 60                                   68.8 More selective
## 61                                   53.3 More selective
## 62                                   57.2 More selective
## 63                                   58.2 More selective
## 64                                   55.8 More selective
## 65                                   46.1 More selective
## 66                                   55.4 More selective
## 67                                   57.9 More selective
## 68                                   46.4 More selective
## 69                                   53.1 More selective
## 70                                   69.0 Most selective
## 71                                   52.0 More selective
## 72                                   53.9 More selective
## 73                                   53.2 More selective
## 74                                   54.1 More selective
## 75                                   60.0 More selective
## 76                                   48.5 More selective
## 77                                   51.3 More selective
## 78                                   55.4 More selective
## 79                                   69.7 More selective
## 80                                   55.4 More selective
## 81                                   54.4 More selective
## 82                                   51.2 More selective
## 83                                   53.4 More selective
## 84                                   52.2 More selective
## 85                                   50.7 More selective
## 86                                   54.0 More selective
## 87                                   51.5      Selective
## 88                                   49.7 More selective
## 89                                   49.1 More selective
## 90                                   54.0 More selective
## 91                                   51.1 More selective
## 92                                   54.8 More selective
## 93                                   49.6 More selective
## 94                                   61.4 More selective
## 95                                   53.3 More selective
## 96                                   53.4 More selective
## 97                                   50.6 More selective
## 98                                   54.9 More selective
## 99                                   49.3 More selective
## 100                                  63.8 More selective
## 101                                  59.9 More selective
## 102                                  60.3 More selective
## 103                                  48.3      Selective
## 104                                  60.9 More selective
## 105                                  50.6 More selective
## 106                                  52.2 More selective
## 107                                  48.3 More selective
##     Fall2017AcceptanceRate MalePercentage FourYearGraduationRate
## 1                     0.06           0.51                   0.89
## 2                     0.05           0.52                   0.84
## 3                     0.06           0.52                   0.88
## 4                     0.07           0.54                   0.85
## 5                     0.09           0.51                   0.88
## 6                     0.07           0.50                   0.87
## 7                     0.05           0.50                   0.75
## 8                     0.10           0.50                   0.88
## 9                     0.09           0.49                   0.86
## 10                    0.12           0.48                   0.88
## 11                    0.09           0.50                   0.84
## 12                    0.08           0.55                   0.79
## 13                    0.10           0.51                   0.88
## 14                    0.09           0.46                   0.86
## 15                    0.11           0.49                   0.86
## 16                    0.13           0.48                   0.85
## 17                    0.16           0.53                   0.83
## 18                    0.19           0.53                   0.90
## 19                    0.16           0.43                   0.75
## 20                    0.16           0.46                   0.88
## 21                    0.22           0.40                   0.82
## 22                    0.16           0.44                   0.90
## 23                    0.17           0.47                   0.76
## 24                    0.16           0.48                   0.77
## 25                    0.22           0.51                   0.76
## 26                    0.27           0.45                   0.88
## 27                    0.15           0.49                   0.87
## 28                    0.27           0.50                   0.77
## 29                    0.28           0.46                   0.84
## 30                    0.28           0.43                   0.75
## 31                    0.33           0.46                   0.70
## 32                    0.24           0.41                   0.84
## 33                    0.34           0.50                   0.77
## 34                    0.34           0.41                   0.83
## 35                    0.23           0.62                   0.39
## 36                    0.42           0.44                   0.68
## 37                    0.32           0.47                   0.88
## 38                    0.36           0.42                   0.85
## 39                    0.34           0.51                   0.55
## 40                    0.25           0.40                   0.81
## 41                    0.33           0.55                   0.66
## 42                    0.27           0.49                   0.00
## 43                    0.21           0.41                   0.73
## 44                    0.40           0.41                   0.77
## 45                    0.54           0.43                   0.63
## 46                    0.62           0.55                   0.70
## 47                    0.43           0.68                   0.61
## 48                    0.36           0.47                   0.58
## 49                    0.54           0.49                   0.61
## 50                    0.36           0.47                   0.87
## 51                    0.25           0.55                   0.76
## 52                    0.47           0.46                   0.70
## 53                    0.36           0.48                   0.72
## 54                    0.48           0.52                   0.59
## 55                    0.57           0.57                   0.51
## 56                    0.58           0.50                   0.60
## 57                    0.50           0.53                   0.67
## 58                    0.49           0.50                   0.71
## 59                    0.46           0.47                   0.65
## 60                    0.48           0.64                   0.82
## 61                    0.41           0.40                   0.73
## 62                    0.48           0.50                   0.70
## 63                    0.44           0.53                   0.67
## 64                    0.52           0.51                   0.23
## 65                    0.56           0.39                   0.77
## 66                    0.47           0.51                   0.59
## 67                    0.70           0.52                   0.54
## 68                    0.49           0.44                   0.63
## 69                    0.46           0.42                   0.74
## 70                    0.44           0.70                   0.42
## 71                    0.51           0.50                   0.55
## 72                    0.57           0.50                   0.67
## 73                    0.60           0.49                   0.65
## 74                    0.50           0.47                   0.64
## 75                    0.70           0.57                   0.63
## 76                    0.29           0.38                   0.76
## 77                    0.39           0.41                   0.60
## 78                    0.40           0.51                   0.73
## 79                    0.56           0.71                   0.55
## 80                    0.51           0.55                   0.50
## 81                    0.42           0.53                   0.53
## 82                    0.41           0.41                   0.69
## 83                    0.63           0.53                   0.76
## 84                    0.72           0.49                   0.52
## 85                    0.57           0.46                   0.53
## 86                    0.50           0.46                   0.70
## 87                    0.41           0.31                   0.43
## 88                    0.76           0.51                   0.63
## 89                    0.71           0.34                   0.69
## 90                    0.89           0.47                   0.59
## 91                    0.57           0.57                   0.57
## 92                    0.60           0.43                   0.73
## 93                    0.86           0.47                   0.54
## 94                    0.54           0.69                   0.32
## 95                    0.68           0.50                   0.67
## 96                    0.80           0.56                   0.45
## 97                    0.58           0.47                   0.65
## 98                    0.66           0.38                   0.67
## 99                    0.67           0.42                   0.62
## 100                   0.66           0.70                   0.58
## 101                   0.79           0.50                   0.00
## 102                   0.57           0.67                   0.28
## 103                   0.83           0.46                   0.52
## 104                   0.61           0.74                   0.28
## 105                   0.64           0.40                   0.66
## 106                   0.52           0.53                   0.00
## 107                   0.57           0.47                   0.45
##     TotalCostThousands
## 1               62.750
## 2               67.580
## 3               73.446
## 4               67.342
## 5               73.356
## 6               69.430
## 7               67.117
## 8               71.904
## 9               71.200
## 10              69.576
## 11              71.193
## 12              67.887
## 13              70.791
## 14              70.326
## 15              66.050
## 16              70.004
## 17              61.350
## 18              68.801
## 19              57.285
## 20              69.839
## 21              65.762
## 22              70.522
## 23              60.996
## 24              71.625
## 25              69.883
## 26              60.481
## 27              70.942
## 28              60.884
## 29              69.354
## 30              69.984
## 31              58.159
## 32              46.359
## 33              69.864
## 34              70.835
## 35              47.616
## 36              38.778
## 37              69.942
## 38              56.937
## 39              55.807
## 40              69.668
## 41              64.232
## 42              68.267
## 43              70.010
## 44              69.252
## 45              40.442
## 46              43.876
## 47              69.140
## 48              48.284
## 49              47.919
## 50              67.478
## 51              66.530
## 52              67.403
## 53              64.334
## 54              43.176
## 55              38.834
## 56              43.988
## 57              46.428
## 58              71.338
## 59              49.696
## 60              65.304
## 61              69.080
## 62              50.972
## 63              47.645
## 64              13.248
## 65              54.900
## 66              47.556
## 67              47.072
## 68              32.131
## 69              70.217
## 70              67.446
## 71              58.370
## 72              47.772
## 73              43.102
## 74              40.683
## 75              39.712
## 76              63.339
## 77              53.342
## 78              39.546
## 79              51.753
## 80              39.522
## 81              40.632
## 82              59.754
## 83              55.750
## 84              50.022
## 85              58.879
## 86              62.338
## 87              40.651
## 88              45.921
## 89              58.528
## 90              54.590
## 91              41.481
## 92              47.174
## 93              41.059
## 94              60.838
## 95              46.608
## 96              51.706
## 97              63.561
## 98              62.896
## 99              54.978
## 100             64.666
## 101             65.892
## 102             57.176
## 103             48.441
## 104             45.726
## 105             56.286
## 106             34.358
## 107             39.992


Part 2: Data Visualization

We plot the data in order to visualize relationships among the attributes.

#Starting Salary
#-histograms
library(ggplot2)
plot_1 <- formatted_college_tab %>%
  ggplot(aes(MedianStartingSalaryOfAlumniThousands)) +
    geom_histogram()+ 
    labs(title="Starting Salary Distribution", x="Median Starting Salary of Alumni (Thousands)", y="Count")
plot_1
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The distribution of the median starting salary of alumni across all the schools appears roughly bell-shaped (slightly skewed right), centered around $55,000.
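The `stat_bin()` message above appears because no bin width was supplied to `geom_histogram()`. As a minimal sketch (not part of the original analysis), one common choice is the Freedman-Diaconis rule, binwidth = 2 * IQR / n^(1/3). The helper name `fd_binwidth` is hypothetical, and the vector below holds only the six salaries printed by `head(college_info)` in Part 3; in practice the full `MedianStartingSalaryOfAlumniThousands` column would be used.

```r
# Freedman-Diaconis bin width: 2 * IQR(x) / n^(1/3)
# fd_binwidth is a hypothetical helper, not part of the original analysis.
fd_binwidth <- function(x) {
  x <- x[!is.na(x)]
  2 * IQR(x) / length(x)^(1 / 3)
}

# Illustration only: the six salaries (in thousands) shown by head(college_info).
salaries <- c(68.4, 66.5, 64.9, 79.8, 57.7, 63.2)
bw <- fd_binwidth(salaries)

# If ggplot2 is installed, pass the value to geom_histogram(binwidth = ...).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  plot_1_binned <- ggplot2::ggplot(data.frame(salary = salaries),
                                   ggplot2::aes(salary)) +
    ggplot2::geom_histogram(binwidth = bw) +
    ggplot2::labs(x = "Median Starting Salary of Alumni (Thousands)", y = "Count")
}
```

Setting `binwidth` explicitly silences the message and makes the histogram shape reproducible across runs.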

#Tuition Cost
#-histograms
library(ggplot2)
plot_2 <- formatted_college_tab %>%
  ggplot(aes(TuitionFeesThousands)) +
    geom_histogram()+ 
        labs(title="Tuition Cost Distribution", x="Tuition Cost (Thousands)", y="Count")
plot_2
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The distribution of tuition costs across all the schools is skewed left, with a range of about $60,000.

#Acceptance rate vs graduation rate

library(ggplot2)
plot_3 <- formatted_college_tab %>%
  ggplot(aes(x=Fall2017AcceptanceRate, y=FourYearGraduationRate)) +
    geom_point()+ 
    geom_smooth(method=lm)+
        labs(title="Acceptance Vs. Graduation Rate", x="Fall 2017 Acceptance Rate", y="Four Year Graduation Rate")
plot_3

There appears to be a negative linear relationship between the Fall 2017 acceptance rate and the four-year graduation rate: the higher the acceptance rate, the lower the graduation rate tends to be.
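The strength of a relationship like this can be quantified with a Pearson correlation rather than read off the plot alone. The sketch below uses `cor.test()` from base R on a small data frame of hypothetical values (they are not taken from the dataset); the column names mirror those in `formatted_college_tab` so the same call could be pointed at the real data.

```r
# Hypothetical values for illustration only; NOT taken from the dataset.
toy <- data.frame(
  Fall2017AcceptanceRate = c(0.06, 0.10, 0.25, 0.40, 0.55, 0.70),
  FourYearGraduationRate = c(0.89, 0.85, 0.75, 0.66, 0.55, 0.45)
)

# Pearson correlation with a test of H0: correlation = 0.
ct <- cor.test(toy$Fall2017AcceptanceRate, toy$FourYearGraduationRate,
               method = "pearson")
r <- unname(ct$estimate)   # sample correlation coefficient
p_value <- ct$p.value      # two-sided p-value

print(r)
print(p_value)
```

A negative `r` with a small p-value would support the downward trend suggested by the fitted line.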

#Boxplots of graduation rate by selectivity
library(ggplot2)
formatted_college_tab$Selectivity <- factor(formatted_college_tab$Selectivity, c("Selective","More selective","Most selective"))
plot_4 <-  formatted_college_tab %>%
  ggplot(aes(x=Selectivity, y=FourYearGraduationRate)) +
    geom_boxplot()+
        labs(title="Graduation Rate based on Selectivity", x="Selectivity Level", y="Four Year Graduation Rate")
plot_4

There is a clear difference in four-year graduation rates across selectivity levels. The boxplots show that the three selectivity levels differ noticeably in both range and central tendency: the more selective a college is, the higher its graduation rate tends to be.
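Boxplots show the difference visually, but they do not test it. A minimal sketch of one way to check it formally is a Kruskal-Wallis rank-sum test, which does not assume normality within groups. The values below are hypothetical and chosen only to illustrate the call. Note also that the real data contain just 2 schools in the Selective category, so any test involving that group should be read with caution.

```r
# Hypothetical graduation rates for illustration only; NOT from the dataset.
toy <- data.frame(
  Selectivity = factor(
    rep(c("Selective", "More selective", "Most selective"), times = c(4, 5, 5)),
    levels = c("Selective", "More selective", "Most selective")
  ),
  FourYearGraduationRate = c(
    0.30, 0.35, 0.28, 0.40,        # Selective
    0.55, 0.60, 0.50, 0.65, 0.58,  # More selective
    0.85, 0.88, 0.84, 0.89, 0.87   # Most selective
  )
)

# H0: all three groups share the same distribution of graduation rates.
kw <- kruskal.test(FourYearGraduationRate ~ Selectivity, data = toy)
print(kw)
```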

#Setting vs. room board

library(ggplot2)
formatted_college_tab$Setting <- factor(formatted_college_tab$Setting, c("Rural","Suburban","Urban", "City"))
plot_5 <- formatted_college_tab %>%
  ggplot(aes(x=Setting, y=RoomBoardThousands)) +
    geom_boxplot()+
        labs(title="Setting vs. Room & Board Costs", x="Setting", y="Room & Board Costs (Thousands)")
plot_5

The boxplots of room & board costs by setting suggest that a college's setting has some influence on room and board costs. The median room and board cost for the City setting differs from that of the other settings. The spread is also greater for the City setting, while it is much smaller for the Rural setting.

#Total cost vs. starting salary
plot_6 <- formatted_college_tab %>%
  ggplot(aes(x=TotalCostThousands, y=MedianStartingSalaryOfAlumniThousands)) +
    geom_point()+ 
    geom_smooth(method=lm)+
        labs(title="Total Cost vs. Median Starting Salary", x="Total Cost (Thousands)", y="Median Starting Salary Of Alumni (Thousands)")
plot_6

There appears to be a positive linear relationship between median starting salary and the total cost of college. The general trend shows that the more students spend on tuition, room, and board, the higher their starting salary tends to be.

#Starting salary by school type
plot_7 <- formatted_college_tab %>%
  ggplot(aes(x=SchoolType, y=MedianStartingSalaryOfAlumniThousands)) +
    geom_boxplot()+
        labs(title="Median Starting Salary Of Alumni Based on School Type", x="School Type", y="Median Starting Salary Of Alumni (Thousands)")
plot_7

Between school types, private colleges seem to have greater starting salaries than public schools, based on the medians of these boxplots.

#Number of schools at each selectivity level
formatted_college_tab %>% group_by(Selectivity) %>%
  summarise(n())
## # A tibble: 3 x 2
##   Selectivity    `n()`
##   <fct>          <int>
## 1 Selective          2
## 2 More selective    61
## 3 Most selective    44
#Male percentage vs. starting salary
plot_8 <- formatted_college_tab %>%
  ggplot(aes(x=MalePercentage, y=MedianStartingSalaryOfAlumniThousands)) +
    geom_point()+
    geom_smooth(method=lm)+
        labs(title="Male Percentage vs. Median Starting Salary of Alumni", x="Male Percentage", y="Median Starting Salary Of Alumni (Thousands)")
plot_8

Although the points are scattered with some variation, there is a general positive correlation between median starting salary of alumni and the male percentage of the student body of colleges.


Part 3: Model Fitting and Selection

Fitting model for tuition prices

#adjusting dataset to remove variables not able to be used in model fitting
college_info <- formatted_college_tab[,-c(1,2)]
head(college_info)
##   TuitionFeesThousands RoomBoardThousands TotalEnrollment    SchoolType
## 1               47.140             15.610            8273 Private, Coed
## 2               50.420             17.160           20604 Private, Coed
## 3               59.430             14.016           25968 Private, Coed
## 4               51.832             15.510           11466 Private, Coed
## 5               57.006             16.350           13736 Private, Coed
## 6               53.430             16.000           12974 Private, Coed
##   YearFounded  Setting Endowment2017Millions
## 1        1746 Suburban                 23400
## 2        1636    Urban                 37100
## 3        1754    Urban                 10000
## 4        1861    Urban                 14800
## 5        1890    Urban                  6600
## 6        1701     City                 27200
##   MedianStartingSalaryOfAlumniThousands    Selectivity
## 1                                  68.4 Most selective
## 2                                  66.5 Most selective
## 3                                  64.9 Most selective
## 4                                  79.8 Most selective
## 5                                  57.7 Most selective
## 6                                  63.2 Most selective
##   Fall2017AcceptanceRate MalePercentage FourYearGraduationRate
## 1                   0.06           0.51                   0.89
## 2                   0.05           0.52                   0.84
## 3                   0.06           0.52                   0.88
## 4                   0.07           0.54                   0.85
## 5                   0.09           0.51                   0.88
## 6                   0.07           0.50                   0.87
##   TotalCostThousands
## 1             62.750
## 2             67.580
## 3             73.446
## 4             67.342
## 5             73.356
## 6             69.430
#rescale proportions (0-1) to percentages (0-100) so each coefficient is per percentage point
college_info$FourYearGraduationRate <- college_info$FourYearGraduationRate*100
college_info$MalePercentage <- college_info$MalePercentage*100
college_info$Fall2017AcceptanceRate <- college_info$Fall2017AcceptanceRate*100
#linear model fitting 
tuition_lm_1 <- lm(TuitionFeesThousands~.-RoomBoardThousands-TotalCostThousands, data = college_info)
summary(tuition_lm_1)
## 
## Call:
## lm(formula = TuitionFeesThousands ~ . - RoomBoardThousands - 
##     TotalCostThousands, data = college_info)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -32.860  -2.743   0.146   3.237  14.500 
## 
## Coefficients:
##                                         Estimate Std. Error t value
## (Intercept)                            8.311e+00  2.691e+01   0.309
## TotalEnrollment                       -3.283e-05  5.921e-05  -0.555
## SchoolTypePublic, Coed                -1.150e+01  1.909e+00  -6.021
## YearFounded                            4.876e-03  1.305e-02   0.374
## SettingSuburban                        9.797e-01  2.825e+00   0.347
## SettingUrban                           1.483e+00  2.899e+00   0.512
## SettingCity                           -2.112e-01  2.878e+00  -0.073
## Endowment2017Millions                 -5.211e-05  1.483e-04  -0.351
## MedianStartingSalaryOfAlumniThousands  1.396e-01  1.705e-01   0.819
## SelectivityMore selective              6.810e+00  4.740e+00   1.437
## SelectivityMost selective              1.136e+01  5.175e+00   2.194
## Fall2017AcceptanceRate                 5.185e-02  5.897e-02   0.879
## MalePercentage                         2.208e-02  1.319e-01   0.167
## FourYearGraduationRate                 1.754e-01  4.529e-02   3.873
##                                       Pr(>|t|)    
## (Intercept)                             0.7581    
## TotalEnrollment                         0.5806    
## SchoolTypePublic, Coed                3.42e-08 ***
## YearFounded                             0.7095    
## SettingSuburban                         0.7295    
## SettingUrban                            0.6100    
## SettingCity                             0.9417    
## Endowment2017Millions                   0.7262    
## MedianStartingSalaryOfAlumniThousands   0.4151    
## SelectivityMore selective               0.1541    
## SelectivityMost selective               0.0307 *  
## Fall2017AcceptanceRate                  0.3815    
## MalePercentage                          0.8674    
## FourYearGraduationRate                  0.0002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.182 on 93 degrees of freedom
## Multiple R-squared:  0.7003, Adjusted R-squared:  0.6584 
## F-statistic: 16.72 on 13 and 93 DF,  p-value: < 2.2e-16
par(mfrow=c(2,2))
plot(tuition_lm_1)
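The residual summary above has a minimum of -32.860, far from the quartiles, which hints at an influential observation. A minimal sketch of how such points could be flagged with Cook's distance follows. The data are synthetic (a straight line with one response deliberately shifted), and the 4/n cutoff is a common rule of thumb rather than something the original analysis used.

```r
# Synthetic data for illustration only: y is roughly 2 * x,
# with the last observation pushed far below the trend.
x <- 1:20
y <- 2 * x + sin(x)
y[20] <- y[20] - 30

toy_lm <- lm(y ~ x)

# Cook's distance measures how much the fitted values change
# when each observation is left out.
cd <- cooks.distance(toy_lm)
cutoff <- 4 / length(cd)        # rule-of-thumb threshold (an assumption)
flagged <- which(cd > cutoff)   # indices of potentially influential points

print(flagged)
```

Applied to `tuition_lm_1`, the same two calls would point at the rows worth inspecting before trusting the coefficient estimates.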

tuition_lm_2 <- step(tuition_lm_1, direction = "both", steps = 1000, trace = F)
summary(tuition_lm_2)
## 
## Call:
## lm(formula = TuitionFeesThousands ~ SchoolType + Selectivity + 
##     FourYearGraduationRate, data = college_info)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -35.663  -2.870   0.402   3.025  14.015 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                30.51347    4.69313   6.502 2.98e-09 ***
## SchoolTypePublic, Coed    -12.40552    1.29191  -9.602 6.18e-16 ***
## SelectivityMore selective   7.47358    4.37232   1.709 0.090437 .  
## SelectivityMost selective  11.51048    4.51421   2.550 0.012265 *  
## FourYearGraduationRate      0.14329    0.03637   3.940 0.000149 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.049 on 102 degrees of freedom
## Multiple R-squared:  0.6854, Adjusted R-squared:  0.6731 
## F-statistic: 55.55 on 4 and 102 DF,  p-value: < 2.2e-16
plot(tuition_lm_2)
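The `step()` call above performs stepwise selection: starting from the full model, it repeatedly adds or drops one term at a time and keeps the change that lowers AIC the most, stopping when no single change helps. The sketch below shows the same procedure on R's built-in `mtcars` data, used here purely as a stand-in because the scraped college table is not bundled with this write-up.

```r
# Stand-in example on the built-in mtcars data (not the college data).
full <- lm(mpg ~ ., data = mtcars)

# Stepwise search in both directions; trace = FALSE suppresses the step log.
reduced <- step(full, direction = "both", trace = FALSE)

# extractAIC() returns c(edf, AIC) on the scale that step() itself uses.
aic_full <- extractAIC(full)[2]
aic_reduced <- extractAIC(reduced)[2]

print(formula(reduced))
print(c(full = aic_full, reduced = aic_reduced))
```

Because AIC penalizes each additional parameter, the selected model can drop predictors even though its R-squared is slightly lower, which matches the pattern seen between `tuition_lm_1` and `tuition_lm_2`.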

anova(tuition_lm_2,tuition_lm_1, test="Chisq")
## Analysis of Variance Table
## 
## Model 1: TuitionFeesThousands ~ SchoolType + Selectivity + FourYearGraduationRate
## Model 2: TuitionFeesThousands ~ (RoomBoardThousands + TotalEnrollment + 
##     SchoolType + YearFounded + Setting + Endowment2017Millions + 
##     MedianStartingSalaryOfAlumniThousands + Selectivity + Fall2017AcceptanceRate + 
##     MalePercentage + FourYearGraduationRate + TotalCostThousands) - 
##     RoomBoardThousands - TotalCostThousands
##   Res.Df    RSS Df Sum of Sq Pr(>Chi)
## 1    102 3731.7                      
## 2     93 3554.7  9    176.99   0.8653
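The p-value in this table can be reproduced by hand from the printed residual sums of squares. The sketch below assumes that `anova()` with `test="Chisq"` scales the extra sum of squares by the residual variance of the larger model and compares the result to a chi-squared distribution; the inputs are the rounded values from the table, so the result agrees only approximately. The corresponding F-test is included for comparison.

```r
# Values copied (rounded) from the anova table above.
rss_reduced <- 3731.7   # Model 1: stepwise-selected model
rss_full    <- 3554.7   # Model 2: full model
df_full     <- 93       # residual df of the full model
df_diff     <- 9        # number of extra parameters in the full model

# Residual variance estimate from the larger model.
scale <- rss_full / df_full

# Chi-squared version: extra sum of squares divided by the scale.
chisq_stat <- (rss_reduced - rss_full) / scale
p_chisq <- pchisq(chisq_stat, df = df_diff, lower.tail = FALSE)

# Equivalent F-test: the same statistic divided by its degrees of freedom.
f_stat <- chisq_stat / df_diff
p_f <- pf(f_stat, df_diff, df_full, lower.tail = FALSE)

print(c(chisq = p_chisq, F = p_f))
```

A large p-value here means the nine dropped terms do not significantly reduce the residual sum of squares, so the smaller model is preferred.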

Fitting model for median starting salary

#linear model fitting (the response is median starting salary, despite the gradrate_ object names)
gradrate_lm_1 <- lm(MedianStartingSalaryOfAlumniThousands~.-TuitionFeesThousands-RoomBoardThousands, data = na.omit(college_info))
summary(gradrate_lm_1)
## 
## Call:
## lm(formula = MedianStartingSalaryOfAlumniThousands ~ . - TuitionFeesThousands - 
##     RoomBoardThousands, data = na.omit(college_info))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.3062 -2.3889 -0.2445  1.8739 15.6698 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                2.371e+01  1.616e+01   1.468   0.1455    
## TotalEnrollment            8.453e-06  3.614e-05   0.234   0.8156    
## SchoolTypePublic, Coed    -1.430e+00  1.325e+00  -1.079   0.2833    
## YearFounded                5.017e-03  7.928e-03   0.633   0.5284    
## SettingSuburban           -1.682e+00  1.707e+00  -0.986   0.3269    
## SettingUrban              -1.043e+00  1.760e+00  -0.593   0.5547    
## SettingCity               -1.384e+00  1.741e+00  -0.795   0.4286    
## Endowment2017Millions      2.166e-04  8.718e-05   2.485   0.0147 *  
## SelectivityMore selective -2.567e+00  2.885e+00  -0.890   0.3759    
## SelectivityMost selective -8.552e-02  3.207e+00  -0.027   0.9788    
## Fall2017AcceptanceRate    -8.235e-02  3.478e-02  -2.368   0.0200 *  
## MalePercentage             5.486e-01  5.637e-02   9.732 7.55e-16 ***
## FourYearGraduationRate     1.843e-02  2.907e-02   0.634   0.5276    
## TotalCostThousands         3.458e-02  5.496e-02   0.629   0.5307    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.752 on 93 degrees of freedom
## Multiple R-squared:  0.7248, Adjusted R-squared:  0.6864 
## F-statistic: 18.85 on 13 and 93 DF,  p-value: < 2.2e-16
plot(gradrate_lm_1)

gradrate_lm_2 <- step(gradrate_lm_1, direction = "both", steps = 1000, trace = F)
summary(gradrate_lm_2)
## 
## Call:
## lm(formula = MedianStartingSalaryOfAlumniThousands ~ SchoolType + 
##     Endowment2017Millions + Selectivity + Fall2017AcceptanceRate + 
##     MalePercentage, data = na.omit(college_info))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.5908 -2.3830 -0.3075  1.8991 15.2461 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                3.545e+01  3.638e+00   9.744  3.6e-16 ***
## SchoolTypePublic, Coed    -1.912e+00  8.010e-01  -2.386  0.01889 *  
## Endowment2017Millions      1.998e-04  7.306e-05   2.735  0.00738 ** 
## SelectivityMore selective -2.042e+00  2.713e+00  -0.752  0.45357    
## SelectivityMost selective  6.553e-01  2.981e+00   0.220  0.82643    
## Fall2017AcceptanceRate    -9.100e-02  3.173e-02  -2.868  0.00504 ** 
## MalePercentage             5.428e-01  4.828e-02  11.243  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.672 on 100 degrees of freedom
## Multiple R-squared:  0.7166, Adjusted R-squared:  0.6996 
## F-statistic: 42.15 on 6 and 100 DF,  p-value: < 2.2e-16
plot(gradrate_lm_2)
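To see what the selected model implies for a single school, the fitted equation can be evaluated by hand. The sketch below hard-codes the rounded coefficients from the summary above and applies them to the first row shown by `head(college_info)` (private, most selective, endowment 23400, acceptance rate 6%, male percentage 51%, observed salary 68.4). The helper `predict_salary` is hypothetical; with the fitted object available, `predict(gradrate_lm_2, newdata = ...)` would be the usual route.

```r
# Rounded coefficients copied from summary(gradrate_lm_2) above.
coefs <- c(
  intercept      = 35.45,
  public         = -1.912,     # SchoolTypePublic, Coed
  endowment      = 1.998e-4,   # Endowment2017Millions
  more_selective = -2.042,
  most_selective = 0.6553,
  acceptance     = -0.09100,   # per percentage point
  male           = 0.5428      # per percentage point
)

# Hypothetical helper: evaluates the fitted linear equation by hand.
predict_salary <- function(public, endowment_millions, selectivity,
                           acceptance_pct, male_pct) {
  sel_effect <- switch(selectivity,
    "Selective"      = 0,   # baseline level
    "More selective" = coefs[["more_selective"]],
    "Most selective" = coefs[["most_selective"]],
    stop("unknown selectivity level")
  )
  coefs[["intercept"]] +
    coefs[["public"]] * as.numeric(public) +
    coefs[["endowment"]] * endowment_millions +
    sel_effect +
    coefs[["acceptance"]] * acceptance_pct +
    coefs[["male"]] * male_pct
}

# First row of head(college_info); rates are already multiplied by 100.
pred <- predict_salary(FALSE, 23400, "Most selective", 6, 51)
residual <- 68.4 - pred   # observed minus predicted, in thousands

print(c(predicted = pred, residual = residual))
```

Because the coefficients are rounded, the result will differ slightly from what `predict()` returns on the fitted object.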

anova(gradrate_lm_2,gradrate_lm_1, test="Chisq")
## Analysis of Variance Table
## 
## Model 1: MedianStartingSalaryOfAlumniThousands ~ SchoolType + Endowment2017Millions + 
##     Selectivity + Fall2017AcceptanceRate + MalePercentage
## Model 2: MedianStartingSalaryOfAlumniThousands ~ (TuitionFeesThousands + 
##     RoomBoardThousands + TotalEnrollment + SchoolType + YearFounded + 
##     Setting + Endowment2017Millions + Selectivity + Fall2017AcceptanceRate + 
##     MalePercentage + FourYearGraduationRate + TotalCostThousands) - 
##     TuitionFeesThousands - RoomBoardThousands
##   Res.Df    RSS Df Sum of Sq Pr(>Chi)
## 1    100 1348.5                      
## 2     93 1309.3  7     39.15   0.9045

Conclusion

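Being aware of the factors associated with tuition and starting salaries is important when deciding where to attend college. In our stepwise models, school type, selectivity, and four-year graduation rate were retained as predictors of tuition, while school type, endowment, selectivity, acceptance rate, and male percentage were retained as predictors of median starting salary.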

References:

  • College Ranking Data: https://www.usnews.com/best-colleges/rankings/national-universities
